RMSProp is an optimizer intended to address AdaGrad’s tendency to shrink the learning rate to zero. RMSProp adds an additional parameter $\gamma$ to AdaGrad that effectively changes the state vector from a sum of past states to an exponential moving average of them. Larger values of $\gamma$ extend the number of previous states that make a non-negligible contribution.

Recall that AdaGrad’s state accumulation can be written as

$$\mathbf{s}_t \leftarrow \mathbf{s}_{t-1} + \mathbf{g}_t \odot \mathbf{g}_t,$$

where $\mathbf{g}_t$ is the gradient at step $t$.
To this, RMSProp adds scaling coefficients $\gamma$ and $(1 - \gamma)$ to the first and second terms respectively:

$$\mathbf{s}_t \leftarrow \gamma \, \mathbf{s}_{t-1} + (1 - \gamma) \, \mathbf{g}_t \odot \mathbf{g}_t.$$
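Unrolling this recursion makes the moving-average interpretation explicit. Assuming the state is initialized to $\mathbf{s}_0 = \mathbf{0}$, past squared gradients are weighted by powers of $\gamma$, so larger $\gamma$ keeps older gradients relevant for longer:

$$\mathbf{s}_t = (1 - \gamma) \sum_{i=0}^{t-1} \gamma^i \, \mathbf{g}_{t-i} \odot \mathbf{g}_{t-i}.$$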
The update rule for $\mathbf{x}_t$ remains the same as in AdaGrad:

$$\mathbf{x}_t \leftarrow \mathbf{x}_{t-1} - \frac{\eta}{\sqrt{\mathbf{s}_t + \epsilon}} \odot \mathbf{g}_t,$$
where $\eta$ is the learning rate, $\epsilon > 0$ is a small constant added for numerical stability, and again $\odot$ is the element-wise product.
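As a concrete illustration, here is a minimal NumPy sketch of a single RMSProp step following the equations above. The function name `rmsprop_update` and the hyperparameter defaults (`lr`, `gamma`, `eps`) are illustrative choices, not values prescribed by the text.

```python
import numpy as np

def rmsprop_update(x, s, grad, lr=0.01, gamma=0.9, eps=1e-8):
    """One RMSProp step: exponential moving average of squared
    gradients, then an AdaGrad-style scaled parameter update."""
    # State: leaky average of the element-wise squared gradient.
    s = gamma * s + (1 - gamma) * grad * grad
    # Update: same form as AdaGrad, but using the moving-average state.
    x = x - lr / np.sqrt(s + eps) * grad
    return x, s

# Usage sketch: minimize f(x) = x0^2 + 2*x1^2 from a fixed start.
x = np.array([1.0, 2.0])
s = np.zeros_like(x)  # state starts at zero, matching s_0 = 0
for _ in range(100):
    grad = np.array([2 * x[0], 4 * x[1]])  # gradient of f at x
    x, s = rmsprop_update(x, s, grad, lr=0.1)
print(x)  # ends close to the minimum at (0, 0)
```

Note that because $\mathbf{s}_t$ is a moving average rather than a growing sum, the effective step size no longer decays toward zero over time, which is precisely the behavior that distinguishes RMSProp from AdaGrad.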